English-Korean Named Entity Transliteration Using Substring Alignment and Re-ranking Methods
Abstract
In this paper, we describe our approach to the English-to-Korean transliteration task in NEWS 2012. Our system mainly consists of two components: letter-to-phoneme alignment with m2m-aligner and a transliteration model trained with DirecTL-p. We train several transliteration models under different parameter settings. We then use two re-ranking methods to select the best transliteration among the predictions of the different models. One re-ranking method is based on the co-occurrence of the transliteration pair in web corpora. The other is the JLIS-Reranking method, which is based on features from the alignment results. Our standard and non-standard runs achieve 0.398 and 0.458 top-1 accuracy, respectively, in the generation task.
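The co-occurrence re-ranking step and the top-1 accuracy measure can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the log-damped web count, the interpolation weight, and the example counts are all assumptions made for illustration. The abstract only states that candidates from several models are re-ranked using web co-occurrence of the transliteration pair.

```python
from collections import defaultdict
from math import log


def rerank_by_cooccurrence(candidates, cooccurrence_counts, weight=0.5):
    """Merge n-best lists from several models and re-rank them.

    candidates: list of (korean_string, model_score) pairs; the same string
        may be proposed by more than one model.
    cooccurrence_counts: dict mapping a Korean string to a hypothetical count
        of web pages where it appears together with the English source name.
    weight: assumed interpolation weight between the two signals.
    """
    # Keep the best model score seen for each distinct candidate.
    best_score = defaultdict(lambda: float("-inf"))
    for text, score in candidates:
        best_score[text] = max(best_score[text], score)

    def combined(text):
        # log(1 + count) dampens very large web counts (an assumed choice).
        web = log(1 + cooccurrence_counts.get(text, 0))
        return (1 - weight) * best_score[text] + weight * web

    return sorted(best_score, key=combined, reverse=True)


def top1_accuracy(predictions, references):
    """Fraction of names whose first-ranked output matches any reference."""
    hits = sum(
        1 for ranked, refs in zip(predictions, references)
        if ranked and ranked[0] in refs
    )
    return hits / len(references)


# Illustrative data only: scores and counts are made up.
pooled = [("클린톤", -0.9), ("클린턴", -1.1), ("클린턴", -1.3)]
web_counts = {"클린턴": 1200, "클린톤": 15}
ranked = rerank_by_cooccurrence(pooled, web_counts)
```

In this toy case the candidate with the stronger web evidence moves to the top even though one model scored it lower. The weight would normally be tuned on development data.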
Related resources
English-Korean Named Entity Transliteration Using Statistical Substring-based and Rule-based Approaches
This paper describes our approach to English-Korean transliteration in the NEWS 2011 Shared Task on Machine Transliteration. We adopt the substring-based transliteration approach, which groups the characters of a named entity in both source and target languages into substrings and then formulates the transliteration as a sequential tagging problem to tag the substrings in the source language with the su...
Machine Transliteration Using Multiple Transliteration Engines and Hypothesis Re-Ranking
This paper describes a novel method of improving machine transliteration by using multiple transliteration hypotheses and re-ranking them. We constructed seven machine-transliteration engines to produce a set of transliteration hypotheses. We then re-ranked the hypotheses to select the correct transliteration hypothesis. We propose a re-ranking method that makes use of confidence-score, languag...
A Hybrid Approach to English-Korean Name Transliteration
This paper presents a hybrid approach to English-Korean name transliteration. The base system is built on MOSES with enabled factored translation features. We expand the base system by combining with various transliteration methods including a Web-based n-best re-ranking, a dictionary-based method, and a rule-based method. Our standard run and best nonstandard run achieve 45.1 and 78.5, respect...
NCU IISR English-Korean and English-Chinese Named Entity Transliteration Using Different Grapheme Segmentation Approaches
This paper describes our approach to the English-Korean and English-Chinese transliteration tasks of NEWS 2015. We use different grapheme segmentation approaches on source and target languages to train several transliteration models based on the M2M-aligner and DirecTL+, a string transduction model. Then, we use two re-ranking techniques based on string similarity and web co-occurrence to select the ...
English-to-Chinese Machine Transliteration using Accessor Variety Features of Source Graphemes
This work presents a grapheme-based approach to English-to-Chinese (E2C) transliteration, which consists of many-to-many (M2M) alignment and conditional random fields (CRF) using accessor variety (AV) as an additional feature to approximate the local context of source graphemes. Experimental results show that the AV of a given English named entity generally improves the effectiveness of E2C transliteration.
Publication date: 2012